AITopics | dominant modality

dominant modality

Modality-Aware SAM: Sharpness-Aware-Minimization Driven Gradient Modulation for Harmonized Multimodal Learning

Neural Information Processing Systems | Jun-23-2026, 03:21:02 GMT

We propose Modality-Aware Sharpness-Aware Minimization (M-SAM), a model-agnostic framework that applies to many modalities and supports early and late fusion scenarios.

artificial intelligence, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Government (0.46)
Health & Medicine (0.46)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Adaptive Re-calibration Learning for Balanced Multimodal Intention Recognition

Neural Information Processing Systems | Jun-22-2026, 19:01:09 GMT

Multimodal Intention Recognition (MIR) plays a critical role in applications such as intelligent assistants, service robots, and autonomous systems. However, in real-world scenarios, different modalities often vary significantly in informativeness, reliability, and noise levels. This leads to modality imbalance, where models tend to over-rely on dominant modalities, thereby limiting generalization and robustness. Although existing methods address this issue at either the sample or model level, they generally fail to account for its multi-level nature. To address this, we propose Adaptive Re-calibration Learning (ARL), a novel dual-path framework that models modality importance from both sample-wise and structural perspectives. ARL incorporates two key mechanisms: Contribution-Inverse Sample Calibration (CISC), which dynamically masks overly dominant modalities at the sample level to encourage attention to underutilized ones; and Weighted Encoder Calibration (WEC), which adjusts encoder weights based on global modality contributions to prevent overfitting. Experimental results on multiple MIR benchmarks demonstrate that ARL significantly outperforms existing methods in both accuracy and robustness, particularly under noisy or modality-degraded conditions.
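The sample-level masking idea behind CISC can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how contributions are measured or how masking probabilities are derived, so the rule used here (mask a modality in proportion to how far its contribution share exceeds the uniform share 1/M) and the names `mask_probabilities`, `apply_mask`, and `strength` are hypothetical.

```python
import random


def mask_probabilities(contributions, strength=1.0):
    """Map per-sample modality contributions to masking probabilities.

    Assumption (not from the paper): a modality is masked with probability
    proportional to the excess of its contribution share over 1/M.
    Modalities at or below the uniform share are never masked.
    """
    total = sum(contributions.values())
    count = len(contributions)
    if count == 0 or total <= 0:
        return {name: 0.0 for name in contributions}
    uniform_share = 1.0 / count
    probs = {}
    for name, value in contributions.items():
        excess = value / total - uniform_share
        probs[name] = min(1.0, max(0.0, strength * excess))
    return probs


def apply_mask(features, probs, rng):
    """Zero out each modality's feature vector with its masking probability."""
    masked = {}
    for name, vector in features.items():
        if rng.random() < probs.get(name, 0.0):
            masked[name] = [0.0] * len(vector)
        else:
            masked[name] = list(vector)
    return masked


if __name__ == "__main__":
    # Hypothetical contribution scores for one sample.
    sample_contributions = {"text": 0.7, "audio": 0.2, "video": 0.1}
    probs = mask_probabilities(sample_contributions)
    features = {"text": [0.4, 0.9], "audio": [0.1, 0.3], "video": [0.5, 0.2]}
    print(probs)
    print(apply_mask(features, probs, random.Random(0)))
```

Only the over-contributing modality receives a non-zero masking probability here, so the underused modalities are always passed through unchanged.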

artificial intelligence, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country: Asia > China (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Vision (0.94)
(2 more...)

Diversity-oriented Deep Multi-modal Clustering

Neural Information Processing Systems | Jun-20-2026, 13:13:00 GMT

Deep multi-modal clustering (DMC) aims to explore the correlated information from different modalities to improve clustering performance. Most existing DMC methods attempt to exploit consistency and/or complementarity information by fusing all modalities, but this leads to the following challenges: 1) Information conflicts between modalities emerge.

data mining, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: Asia > China (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology (1.00)
Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
(3 more...)

Question Describe the given video and audio in detail

Neural Information Processing Systems | Jun-17-2026, 13:27:27 GMT

Hallucination remains a major challenge in multimodal large language models (MLLMs). To address this, various contrastive decoding (CD) methods have been proposed that contrast original logits with hallucinated logits generated from perturbed inputs. While CD has shown promise in vision-language models (VLMs), it is not well-suited for audio-visual LLMs (AV-LLMs), where hallucinations often emerge from both unimodal and cross-modal combinations involving audio, video, and language. These intricate interactions call for a more adaptive and modality-aware decoding strategy. In this paper, we propose Audio-Visual Contrastive Decoding (AVCD), a novel, training-free decoding framework designed to model trimodal interactions and suppress modality-induced hallucinations in AV-LLMs.
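For orientation, the generic contrastive decoding step that the abstract refers to can be sketched as below. This shows only the single-contrast form commonly used in vision-language work, (1 + alpha) * original - alpha * perturbed; it is not AVCD's trimodal formulation, which the abstract does not spell out. The function names, the value of `alpha`, and the example logits are illustrative assumptions.

```python
def contrastive_logits(original, perturbed, alpha=1.0):
    """Contrast logits from the clean input against logits from a perturbed input.

    Tokens that stay likely even when the input is perturbed (a sign that the
    model's prior, not the input, is driving them) are pushed down.
    """
    if len(original) != len(perturbed):
        raise ValueError("logit vectors must have the same length")
    return [(1.0 + alpha) * o - alpha * p for o, p in zip(original, perturbed)]


def greedy_token(logits):
    """Index of the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])


if __name__ == "__main__":
    # Hypothetical 3-token vocabulary. Token 0 scores high with or without
    # the real input; token 1 scores high only with the real input.
    original = [2.0, 1.9, 0.1]
    perturbed = [2.0, 0.5, 0.1]
    print(greedy_token(original))
    print(greedy_token(contrastive_logits(original, perturbed)))
```

With `alpha=0.0` the function returns the original logits unchanged, so `alpha` controls how strongly the perturbed branch is subtracted.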

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Modality-Aware SAM: Sharpness-Aware-Minimization Driven Gradient Modulation for Harmonized Multimodal Learning

Neural Information Processing Systems | Jun-14-2026, 07:42:25 GMT

We propose Modality-Aware Sharpness-Aware Minimization (M-SAM), a model-agnostic framework that applies to many modalities and supports early and late fusion scenarios.

artificial intelligence, machine learning, proceedings, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.38)

Adaptive Re-calibration Learning for Balanced Multimodal Intention Recognition

Neural Information Processing Systems | Jun-14-2026, 04:22:57 GMT

Multimodal Intention Recognition (MIR) plays a critical role in applications such as intelligent assistants, service robots, and autonomous systems. However, in real-world settings, different modalities often vary significantly in informativeness, reliability, and noise levels. This leads to modality imbalance, where models tend to over-rely on dominant modalities, thereby limiting generalization and robustness. While existing methods attempt to alleviate this issue at either the sample or model level, most overlook its multi-level nature. To address this, we propose Adaptive Re-calibration Learning (ARL), a novel dual-path framework that models modality importance from both sample-wise and structural perspectives. ARL incorporates two key mechanisms: Contribution-Inverse Sample Calibration (CISC), which dynamically masks overly dominant modalities at the sample level to encourage attention to underutilized ones; and Weighted Encoder Calibration (WEC), which adjusts encoder weights based on global modality contributions to prevent overfitting. Experimental results on multiple MIR benchmarks demonstrate that ARL significantly outperforms existing methods in both accuracy and robustness, particularly under noisy or modality-degraded conditions.
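One way to read the model-level mechanism (WEC) is as a per-encoder scaling factor that shrinks updates for modalities with a large global contribution. The sketch below is a guess at that idea, not the paper's method: the inverse-contribution rule, the mean-one normalization, the moving-average tracker, and all names are hypothetical.

```python
def running_contribution(previous, current, momentum=0.9):
    """Exponential moving average of per-modality contribution scores."""
    return {
        name: momentum * previous[name] + (1.0 - momentum) * current[name]
        for name in previous
    }


def encoder_scales(global_contributions, eps=1e-8):
    """Per-encoder update scales, inversely related to global contribution.

    Assumption (not from the paper): scales are proportional to
    1 / contribution and normalized so that their mean is 1.
    """
    if not global_contributions:
        return {}
    inverse = {name: 1.0 / (value + eps) for name, value in global_contributions.items()}
    mean_inverse = sum(inverse.values()) / len(inverse)
    return {name: value / mean_inverse for name, value in inverse.items()}


if __name__ == "__main__":
    # Hypothetical global contribution estimates after some training steps.
    contributions = {"text": 0.6, "audio": 0.3, "video": 0.1}
    print(encoder_scales(contributions))
```

In a training loop, such scales could multiply each encoder's learning rate or gradient, so the dominant encoder is updated more cautiously while weaker encoders are updated more aggressively.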

artificial intelligence, machine learning, proceedings, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Robots (0.60)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.60)
Information Technology > Artificial Intelligence > Machine Learning (0.40)

f0c68d99827dc09ed28aa073455efcbe-Paper-Conference.pdf

Neural Information Processing Systems | Feb-18-2026, 15:56:09 GMT

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

Europe > Italy > Tuscany > Florence (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Towards Robust Multimodal Sentiment Analysis with Incomplete Data

Neural Information Processing Systems | Feb-15-2026, 12:18:21 GMT

Recognizing that the language modality typically contains dense sentiment information, we consider it as the dominant modality and present an innovative Language-dominated Noise-resistant Learning Network (LNLN) to achieve robust MSA.

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

Asia > China > Hubei Province > Wuhan (0.04)
Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.41)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.41)

Modality-Aware SAM: Sharpness-Aware-Minimization Driven Gradient Modulation for Harmonized Multimodal Learning

Nowdeh, Hossein R., Ji, Jie, Ma, Xiaolong, Afghah, Fatemeh

arXiv.org Artificial Intelligence | Oct-30-2025

In multimodal learning, dominant modalities often overshadow others, limiting generalization. We propose Modality-Aware Sharpness-Aware Minimization (M-SAM), a model-agnostic framework that applies to many modalities and supports early and late fusion scenarios. In every iteration, M-SAM optimizes learning in three steps. First, it identifies the dominant modality based on each modality's contribution to accuracy, measured with Shapley values. Second, it decomposes the loss landscape; in other words, it modulates the loss to prioritize the robustness of the model in favor of the dominant modality. Third, M-SAM updates the weights by backpropagating the modulated gradients. This ensures robust learning for the dominant modality while enhancing contributions from others, allowing the model to explore and exploit complementary features that strengthen overall performance. Extensive experiments on four diverse datasets show that M-SAM outperforms the latest state-of-the-art optimization and gradient manipulation methods and significantly balances and improves multimodal learning.
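The first of the three steps, picking the dominant modality from Shapley contributions, can be illustrated with an exact Shapley computation over modality subsets. This is a minimal sketch, assuming a caller-supplied function that returns accuracy for any subset of modalities. The accuracy table below is made up for illustration, and the loss modulation and SAM update steps are not shown because the abstract does not specify them.

```python
from itertools import combinations
from math import factorial


def shapley_values(modalities, value):
    """Exact Shapley value of each modality for a subset-valued score.

    `value` takes a frozenset of modality names and returns a score such as
    validation accuracy when only those modalities are available.
    """
    count = len(modalities)
    result = {}
    for modality in modalities:
        others = [name for name in modalities if name != modality]
        total = 0.0
        for size in range(count):
            weight = factorial(size) * factorial(count - size - 1) / factorial(count)
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                total += weight * (value(coalition | {modality}) - value(coalition))
        result[modality] = total
    return result


def dominant_modality(modalities, value):
    """Modality with the largest Shapley contribution."""
    scores = shapley_values(modalities, value)
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Hypothetical accuracies for every subset of two modalities.
    accuracy = {
        frozenset(): 0.10,
        frozenset({"audio"}): 0.50,
        frozenset({"video"}): 0.30,
        frozenset({"audio", "video"}): 0.60,
    }
    print(shapley_values(["audio", "video"], accuracy.__getitem__))
    print(dominant_modality(["audio", "video"], accuracy.__getitem__))
```

Exact enumeration costs on the order of 2^M evaluations for M modalities, which is workable for the small modality counts typical of multimodal benchmarks but would need sampling for larger M.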

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2510.24919

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Multimodal Negative Learning

Gong, Baoquan, Gao, Xiyuan, Zhu, Pengfei, Hu, Qinghua, Cao, Bing

arXiv.org Artificial Intelligence | Oct-27-2025

Multimodal learning systems often encounter challenges related to modality imbalance, where a dominant modality may overshadow others, thereby hindering the learning of weak modalities. Conventional approaches often force weak modalities to align with dominant ones in "Learning to be (the same)" (Positive Learning), which risks suppressing the unique information inherent in the weak modalities. To address this challenge, we offer a new learning paradigm: "Learning Not to be" (Negative Learning). Instead of enhancing weak modalities' target-class predictions, the dominant modalities dynamically guide the weak modality to suppress non-target classes. This stabilizes the decision space and preserves modality-specific information, allowing weak modalities to retain unique information without being over-aligned. We then analyze multimodal learning from a robustness perspective and theoretically derive the Multimodal Negative Learning (MNL) framework, which introduces a dynamic guidance mechanism tailored for negative learning. Our method provably tightens the robustness lower bound of multimodal learning by increasing the Unimodal Confidence Margin (UCoM) and reduces the empirical error of weak modalities, particularly under noisy and imbalanced scenarios. Extensive experiments across multiple benchmarks demonstrate the effectiveness and generalizability of our approach against competing methods. The code will be available at https://github.com/BaoquanGong/Multimodal-Negative-Learning.git.
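The contrast with positive learning can be made concrete with a small loss sketch. It uses the classical negative-learning term -log(1 - p_k) on the weak modality's non-target classes, optionally weighted by guidance derived from the dominant modality. The weighting rule in `guidance_from_dominant` is a hypothetical stand-in: the abstract does not describe MNL's actual dynamic guidance mechanism, and all names here are illustrative.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def guidance_from_dominant(dominant_logits, target):
    """Hypothetical guidance: weight each non-target class by how confidently
    the dominant modality rejects it. The target class gets weight 0."""
    probs = softmax(dominant_logits)
    return [0.0 if k == target else 1.0 - p for k, p in enumerate(probs)]


def negative_learning_loss(weak_logits, target, guidance=None, eps=1e-12):
    """Penalize probability mass the weak modality puts on non-target classes.

    The target-class probability is never pushed up directly; only
    non-target classes are suppressed.
    """
    probs = softmax(weak_logits)
    loss = 0.0
    for k, p in enumerate(probs):
        if k == target:
            continue
        weight = 1.0 if guidance is None else guidance[k]
        loss += -weight * math.log(max(eps, 1.0 - p))
    return loss


if __name__ == "__main__":
    # Hypothetical 3-class logits; class 0 is the target.
    weak = [0.0, 3.0, 0.0]
    dominant = [4.0, 0.5, 0.2]
    weights = guidance_from_dominant(dominant, target=0)
    print(negative_learning_loss(weak, target=0, guidance=weights))
```

Because the loss touches only non-target classes, the weak modality is not forced to reproduce the dominant modality's target-class confidence.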

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2510.20877

Country: North America > United States (0.28)

Genre: